IEICE global.ieice.org Site

Keyword Search Result

[Keyword] neural networks(287hit)

81-100hit(287hit)

Deep Neural Network Based Monaural Speech Enhancement with Low-Rank Analysis and Speech Present Probability
Wenhua SHI Xiongwei ZHANG Xia ZOU Meng SUN Wei HAN Li LI Gang MIN

LETTER-Noise and Vibration

Vol: E101-A No:3
Page(s): 585-589
A monaural speech enhancement method combining a deep neural network (DNN) with low rank analysis and speech present probability is proposed in this letter. Low rank and sparse analysis is first applied to the noisy speech spectrogram to obtain an approximate low rank representation of the noise. Then a joint feature training strategy for DNN-based speech enhancement is presented, which helps the DNN better predict the target speech. To reduce the residual noise in highly overlapping regions and in the high frequency domain, speech present probability (SPP) weighted post-processing is employed to further improve the quality of the speech enhanced by the trained DNN model. Compared with supervised non-negative matrix factorization (NMF) and the conventional DNN method, the proposed method obtains improved speech enhancement performance under stationary and non-stationary conditions.
Corpus Expansion for Neural CWS on Microblog-Oriented Data with λ-Active Learning Approach
Jing ZHANG Degen HUANG Kaiyu HUANG Zhuang LIU Fuji REN

PAPER-Natural Language Processing

Publicized: 2017/12/08
Vol: E101-D No:3
Page(s): 778-785
Microblog data contains rich information about real-world events and has great commercial value, so microblog-oriented natural language processing (NLP) tasks have attracted considerable attention from researchers. However, the performance of microblog-oriented Chinese Word Segmentation (CWS) based on deep neural networks (DNNs) is still unsatisfactory. One critical reason is that the existing microblog-oriented training corpus is inadequate for training effective weight matrices for DNNs. In this paper, we propose a novel active learning method to extend the scale of the training corpus for DNNs. However, because microblogs contain a large number of partially overlapping sentences, it is difficult to select samples with high annotation value from raw microblogs during the active learning procedure. To select samples with higher annotation value, a parameter λ is introduced to control the number of repeatedly selected samples. Meanwhile, various strategies are adopted to measure the overall annotation value of a sample during the active learning procedure. Experiments on the benchmark datasets of NLPCC 2015 show that our λ-active learning method outperforms the baseline system and the state-of-the-art method. The results also demonstrate that the performance of the DNNs trained on the extended corpus is significantly improved.
End-to-End Exposure Fusion Using Convolutional Neural Network
Jinhua WANG Weiqiang WANG Guangmei XU Hongzhe LIU

LETTER-Image Recognition, Computer Vision

Publicized: 2017/11/22
Vol: E101-D No:2
Page(s): 560-563
In this paper, we describe the direct learning of an end-to-end mapping between under-/over-exposed images and well-exposed images. The mapping is represented as a deep convolutional neural network (CNN) that takes multiple-exposure images as input and outputs a high-quality image. Our CNN has a lightweight structure, yet gives state-of-the-art fusion quality. Furthermore, we know that for a given pixel, the influence of the surrounding pixels gradually increases as the distance decreases. If the only pixels considered are those in the convolution kernel neighborhood, the final result will be affected. To overcome this problem, the size of the convolution kernel is often increased. However, this also increases the complexity of the network (too many parameters) and the training time. In this paper, we present a method in which a number of sub-images of the source image are obtained using the same CNN model, providing more neighborhood information for the convolution operation. Experimental results demonstrate that the proposed method achieves better performance in terms of both objective evaluation and visual quality.
Daily Activity Recognition with Large-Scaled Real-Life Recording Datasets Based on Deep Neural Network Using Multi-Modal Signals
Tomoki HAYASHI Masafumi NISHIDA Norihide KITAOKA Tomoki TODA Kazuya TAKEDA

PAPER-Engineering Acoustics

Vol: E101-A No:1
Page(s): 199-210
In this study, toward the development of a smartphone-based monitoring system for life logging, we collect over 1,400 hours of recordings covering both the outdoor and indoor daily activities of 19 subjects under practical conditions, using a smartphone and a small camera. We then construct a large human activity database that consists of an environmental sound signal, triaxial acceleration signals, and manually annotated activity tags. Using this database, we evaluate the activity recognition performance of deep neural networks (DNNs), which have achieved great performance in various fields, and apply DNN-based adaptation techniques to improve the performance with only a small amount of subject-specific training data. We experimentally demonstrate that: 1) the use of multi-modal signals, including environmental sound and triaxial acceleration signals, with a DNN is effective for improving activity recognition performance; 2) the DNN can discriminate specified activities from a mixture of ambiguous activities; and 3) DNN-based adaptation methods are effective even if only a small amount of subject-specific training data is available.
Feature Adaptive Correlation Tracking
Yulong XU Yang LI Jiabao WANG Zhuang MIAO Hang LI Yafei ZHANG

LETTER-Image Recognition, Computer Vision

Publicized: 2016/11/28
Vol: E100-D No:3
Page(s): 594-597
The feature extractor plays an important role in visual tracking, but most state-of-the-art methods employ the same feature representation in all scenes. Given the diversity of scenes, a tracker should choose different features according to the video. In this work, we propose a novel feature adaptive correlation tracker, which decomposes the tracking task into translation and scale estimation. According to the luminance of the target, our approach automatically selects either hierarchical convolutional features or histogram of oriented gradient features for translation estimation in varied scenarios. Furthermore, we employ a discriminative correlation filter to handle scale variations. Extensive experiments are performed on a challenging large-scale benchmark dataset. The results show that the proposed algorithm outperforms state-of-the-art trackers in accuracy and robustness.
An Improved Supervised Speech Separation Method Based on Perceptual Weighted Deep Recurrent Neural Networks
Wei HAN Xiongwei ZHANG Meng SUN Li LI Wenhua SHI

LETTER-Speech and Hearing

Vol: E100-A No:2
Page(s): 718-721
In this letter, we propose a novel speech separation method based on a perceptually weighted deep recurrent neural network (DRNN) that incorporates the masking properties of the human auditory system. In the supervised training stage, we first use the clean label speech of two different speakers to calculate two perceptual weighting matrices. The two perceptual weighting matrices are then used to adjust the mean squared error between the network outputs and the reference features of the two clean speech signals so that the two speech signals can mask each other. Experimental results on the TSP speech corpus demonstrate that the proposed speech separation approach achieves significant improvements over state-of-the-art methods when tested with different mixing cases.
Multi-Channel Convolutional Neural Networks for Image Super-Resolution
Shinya OHTANI Yu KATO Nobutaka KUROKI Tetsuya HIROSE Masahiro NUMA

PAPER-IMAGE PROCESSING

Vol: E100-A No:2
Page(s): 572-580
This paper proposes image super-resolution techniques with multi-channel convolutional neural networks. In the proposed method, output pixels are classified into K×K groups depending on their coordinates. Those groups are generated from separate channels of a convolutional neural network (CNN). Finally, they are synthesized into a K×K magnified image. This architecture can enlarge images directly without bicubic interpolation. Experimental results of 2×2, 3×3, and 4×4 magnifications have shown that the average PSNR for the proposed method is about 0.2dB higher than that for the conventional SRCNN.
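The synthesis step described in this abstract, where K×K groups of output pixels are interleaved into one magnified image, can be sketched roughly as follows. This is only an illustration: the function name `synthesize` and the channel ordering (channel `a * k + b` supplies the pixels at row offset `a` and column offset `b`) are assumptions, not details taken from the paper.

```python
def synthesize(channels, k):
    """Interleave k*k low-resolution channels into one k-times magnified image.

    Assumed (hypothetical) ordering: channels[a * k + b][y][x] holds the
    output pixel at row y * k + a and column x * k + b.
    """
    if len(channels) != k * k:
        raise ValueError("expected k*k channels")
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0.0] * (w * k) for _ in range(h * k)]
    for a in range(k):          # row offset inside each k x k output block
        for b in range(k):      # column offset inside each k x k output block
            group = channels[a * k + b]
            for y in range(h):
                for x in range(w):
                    out[y * k + a][x * k + b] = group[y][x]
    return out
```

With k = 2 and four 1×2 channels, the result is a 2×4 image whose pixels alternate between the channels, so no bicubic pre-enlargement of the input is needed in this scheme.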
Joint Optimization of Perceptual Gain Function and Deep Neural Networks for Single-Channel Speech Enhancement
Wei HAN Xiongwei ZHANG Gang MIN Xingyu ZHOU Meng SUN

LETTER-Noise and Vibration

Vol: E100-A No:2
Page(s): 714-717
In this letter, we explore joint optimization of perceptual gain function and deep neural networks (DNNs) for a single-channel speech enhancement task. A DNN architecture is proposed which incorporates the masking properties of the human auditory system to make the residual noise inaudible. This new DNN architecture directly trains a perceptual gain function which is used to estimate the magnitude spectrum of clean speech from noisy speech features. Experimental results demonstrate that the proposed speech enhancement approach can achieve significant improvements over the baselines when tested with TIMIT sentences corrupted by various types of noise, no matter whether the noise conditions are included in the training set or not.
Using a Single Dendritic Neuron to Forecast Tourist Arrivals to Japan
Wei CHEN Jian SUN Shangce GAO Jiu-Jun CHENG Jiahai WANG Yuki TODO

PAPER-Biocybernetics, Neurocomputing

Publicized: 2016/10/18
Vol: E100-D No:1
Page(s): 190-202
With the fast growth of the international tourism industry, forecasting tourism demand in the international tourism market has become a challenge. Traditional forecasting methods usually suffer from poor prediction accuracy due to the high volatility, irregular movements, and non-stationarity of tourist time series. In this study, a novel single dendritic neuron model (SDNM) is proposed for tourism demand forecasting. First, we use phase space reconstruction to analyze the characteristics of the tourism data and reconstruct the time series into proper phase space points. Then, the maximum Lyapunov exponent is employed to identify the chaotic properties of the time series and to determine the limit of prediction. Finally, we use the SDNM to make a short-term prediction. Experimental results for forecasting the monthly foreign tourist arrivals to Japan indicate that the proposed SDNM is more efficient and accurate than other neural networks, including the multi-layered perceptron, the neuro-fuzzy inference system, the Elman network, and the single multiplicative neuron model.
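Phase space reconstruction is commonly done with time-delay embedding, and a minimal sketch of that standard technique is given here. The abstract does not state the embedding dimension or delay used, so `dim` and `tau` below are hypothetical parameters, and the function name `delay_embed` is illustrative.

```python
def delay_embed(series, dim, tau):
    """Map a scalar time series to delay vectors.

    Each phase space point is
    [series[i], series[i + tau], ..., series[i + (dim - 1) * tau]].
    dim (embedding dimension) and tau (delay) are illustrative parameters.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive")
    count = len(series) - (dim - 1) * tau  # number of complete delay vectors
    return [
        [series[i + j * tau] for j in range(dim)]
        for i in range(max(count, 0))
    ]
```

A predictor would then take each delay vector as its input and the next value of the series as its target.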
Global Hyperbolic Hopfield Neural Networks
Masaki KOBAYASHI

PAPER-Nonlinear Problems

Vol: E99-A No:12
Page(s): 2511-2516
In recent years, applications of neural networks with Clifford algebra have become widespread. Hyperbolic numbers are a useful Clifford algebra for dealing with hyperbolic geometry. Extending the Hopfield neural network to hyperbolic versions is difficult, though several models have been proposed. Multistate or continuous hyperbolic Hopfield neural networks are promising models. However, the connection weights and the domain of the activation function are limited to the right quadrant of the hyperbolic plane, and the learning algorithms are restricted. In this work, the connection weights and activation function are extended to the entire hyperbolic plane. In addition, the energy is defined, and it is proven that the energy does not increase.
Speeding up Deep Neural Networks in Speech Recognition with Piecewise Quantized Sigmoidal Activation Function
Anhao XING Qingwei ZHAO Yonghong YAN

LETTER-Acoustic modeling

Publicized: 2016/07/19
Vol: E99-D No:10
Page(s): 2558-2561
This paper proposes a new quantization framework for the activation functions of deep neural networks (DNNs). We implement a fixed-point DNN by quantizing the activations into power-of-two integers. The costly multiplication operations when using the DNN can then be replaced with low-cost bit shifts, which saves a large amount of computation. As a result, applying DNN-based speech recognition on embedded systems becomes much easier. Experiments show that the proposed method leads to no performance degradation.
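To illustrate why power-of-two activations remove multiplications, here is a small sketch. It assumes activations in (0, 1] are rounded to the nearest power of two on a log scale and that weights are fixed-point integers. The function names, the `max_shift` limit, and the treatment of zero are assumptions for illustration and do not reproduce the paper's piecewise quantization scheme.

```python
import math


def quantize_exponent(activation, max_shift=7):
    """Return e so that 2**-e approximates an activation in (0, 1].

    Rounding is done on the log2 scale; max_shift is a hypothetical limit
    on the representable exponents.
    """
    if activation <= 0.0:
        return max_shift  # assumption: smallest representable value stands in for zero
    e = round(-math.log2(activation))
    return min(max(e, 0), max_shift)


def shift_multiply(weight_fixed, exponent):
    """Compute weight_fixed * 2**-exponent with a right shift (rounds toward minus infinity)."""
    return weight_fixed >> exponent
```

For example, an activation of 0.25 becomes the exponent 2, so multiplying a fixed-point weight of 96 by it reduces to `96 >> 2`.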
Steady-versus-Transient Plot for Analysis of Digital Maps
Hiroki YAMAOKA Toshimichi SAITO

PAPER-Nonlinear Problems

Vol: E99-A No:10
Page(s): 1806-1812
A digital map is a simple dynamical system that is related to various digital dynamical systems including cellular automata, dynamic binary neural networks, and digital spiking neurons. Depending on parameters and initial condition, the map can exhibit various periodic orbits and transient phenomena to them. In order to analyze the dynamics, we present two simple feature quantities. The first and second quantities characterize the plentifulness of the periodic phenomena and the deviation of the transient phenomena, respectively. Using the two feature quantities, we construct the steady-versus-transient plot that is useful in the visualization and consideration of various digital dynamical systems. As a first step, we demonstrate analysis results for an example of the digital maps based on analog bifurcating neuron models.
Speaker Adaptive Training Localizing Speaker Modules in DNN for Hybrid DNN-HMM Speech Recognizers
Tsubasa OCHIAI Shigeki MATSUDA Hideyuki WATANABE Xugang LU Chiori HORI Hisashi KAWAI Shigeru KATAGIRI

PAPER-Acoustic modeling

Publicized: 2016/07/19
Vol: E99-D No:10
Page(s): 2431-2443
Among various training concepts for speaker adaptation, Speaker Adaptive Training (SAT) has been successfully applied to a standard Hidden Markov Model (HMM) speech recognizer, whose state is associated with Gaussian Mixture Models (GMMs). On the other hand, focusing on the high discriminative power of Deep Neural Networks (DNNs), a new type of speech recognizer structure, which combines DNNs and HMMs, has been vigorously investigated in the speaker adaptation research field. Along these two lines, it is natural to conceive of further improvement to a DNN-HMM recognizer by employing the training concept of SAT. In this paper, we propose a novel speaker adaptation scheme that applies SAT to a DNN-HMM recognizer. Our SAT scheme allocates a Speaker Dependent (SD) module to one of the intermediate layers of DNN, treats its remaining layers as a Speaker Independent (SI) module, and jointly trains the SD and SI modules while switching the SD module in a speaker-by-speaker manner. We implement the scheme using a DNN-HMM recognizer, whose DNN has seven layers, and elaborate its utility over TED Talks corpus data. Our experimental results show that in the supervised adaptation scenario, our Speaker-Adapted (SA) SAT-based recognizer reduces the word error rate of the baseline SI recognizer and the lowest word error rate of the SA SI recognizer by 8.4% and 0.7%, respectively, and by 6.4% and 0.6% in the unsupervised adaptation scenario. The error reductions gained by our SA-SAT-based recognizers proved to be significant by statistical testing. The results also show that our SAT-based adaptation outperforms, regardless of the SD module layer selection, its counterpart SI-based adaptation, and that the inner layers of DNN seem more suitable for SD module allocation than the outer layers.
Design of Multilevel Hybrid Classifier with Variant Feature Sets for Intrusion Detection System
Aslıhan AKYOL Mehmet HACIBEYOĞLU Bekir KARLIK

PAPER-Information Network

Publicized: 2016/04/05
Vol: E99-D No:7
Page(s): 1810-1821
With the increase of network components connected to the Internet, the need to ensure secure connectivity is becoming increasingly vital. Intrusion Detection Systems (IDSs) are one of the common security components that identify security violations. This paper proposes a novel multilevel hybrid classifier that uses different feature sets on each classifier. It presents the Discernibility Function based Feature Selection method and two classifiers involving multilayer perceptron (MLP) and decision tree (C4.5). Experiments are conducted on the KDD'99 Cup and ISCX datasets, and the proposal demonstrates better performance than individual classifiers and other proposed hybrid classifiers. The proposed method provides significant improvement in the detection rates of attack classes and Cost Per Example (CPE) which was the primary evaluation method in the KDD'99 Cup competition.
Food Image Recognition Using Covariance of Convolutional Layer Feature Maps
Atsushi TATSUMA Masaki AONO

LETTER-Image Recognition, Computer Vision

Publicized: 2016/02/23
Vol: E99-D No:6
Page(s): 1711-1715
Recent studies have obtained superior performance in image recognition tasks by using, as an image representation, the fully connected layer activations of Convolutional Neural Networks (CNN) trained with various kinds of images. However, the CNN representation is not very suitable for fine-grained image recognition tasks involving food image recognition. For improving performance of the CNN representation in food image recognition, we propose a novel image representation that is comprised of the covariances of convolutional layer feature maps. In the experiment on the ETHZ Food-101 dataset, our method achieved 58.65% averaged accuracy, which outperforms the previous methods such as the Bag-of-Visual-Words Histogram, the Improved Fisher Vector, and CNN-SVM.
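The core quantity in this representation, the covariance between convolutional feature maps, can be sketched as follows. The sketch assumes each feature map has already been flattened to a list of activations over spatial positions and uses the unbiased sample covariance. The function name is hypothetical, and any normalization or vectorization steps used in the paper are not reproduced here.

```python
def feature_map_covariance(feature_maps):
    """Return the C x C sample covariance matrix of C flattened feature maps.

    feature_maps[c][p] is the activation of channel c at spatial position p.
    """
    n = len(feature_maps[0])
    if n < 2:
        raise ValueError("need at least two spatial positions")
    means = [sum(fm) / n for fm in feature_maps]
    centered = [[v - m for v in fm] for fm, m in zip(feature_maps, means)]
    c = len(feature_maps)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
            for j in range(c)
        ]
        for i in range(c)
    ]
```

The matrix is symmetric, so only its upper triangle carries distinct values when it is used as an image descriptor.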
Development of an Estimation Model for Instantaneous Presence in Audio-Visual Content
Kenji OZAWA Shota TSUKAHARA Yuichiro KINOSHITA Masanori MORISE

PAPER

Publicized: 2015/10/21
Vol: E99-D No:1
Page(s): 120-127
The sense of presence is often used to evaluate the performances of audio-visual (AV) content and systems. However, a presence meter has yet to be realized. We consider that the sense of presence can be divided into two aspects: system presence and content presence. In this study we focused on content presence. To estimate the overall presence of a content item, we have developed estimation models for the sense of presence in audio-only and audio-visual content. In this study, the audio-visual model is expanded to estimate the instantaneous presence in an AV content item. Initially, we conducted an evaluation experiment of the presence with 40 content items to investigate the relationship between the features of the AV content and the instantaneous presence. Based on the experimental data, a neural-network-based model was developed by expanding the previous model. To express the variation in instantaneous presence, 6 audio-related features and 14 visual-related features, which are extracted from the content items in 500-ms intervals, are used as inputs for the model. The audio-related features are loudness, sharpness, roughness, dynamic range and standard deviation in sound pressure levels, and movement of sound images. The visual-related features involve hue, lightness, saturation, and movement of visual images. After constructing the model, a generalization test confirmed that the model is sufficiently accurate to estimate the instantaneous presence. Hence, the model should contribute to the development of a presence meter.
Supervised Denoising Pre-Training for Robust ASR with DNN-HMM
Shin Jae KANG Kang Hyun LEE Nam Soo KIM

LETTER-Speech and Hearing

Publicized: 2015/09/07
Vol: E98-D No:12
Page(s): 2345-2348
In this letter, we propose a novel supervised pre-training technique for deep neural network (DNN)-hidden Markov model systems to achieve robust speech recognition in adverse environments. In the proposed approach, our aim is to initialize the DNN parameters such that they yield abstract features robust to acoustic environment variations. In order to achieve this, we first derive the abstract features from an early fine-tuned DNN model which is trained based on a clean speech database. By using the derived abstract features as the target values, the standard error back-propagation algorithm with the stochastic gradient descent method is performed to estimate the initial parameters of the DNN. The performance of the proposed algorithm was evaluated on Aurora-4 DB, and better results were observed compared to a number of conventional pre-training methods.
Uniqueness Theorem of Complex-Valued Neural Networks with Polar-Represented Activation Function
Masaki KOBAYASHI

PAPER-Nonlinear Problems

Vol: E98-A No:9
Page(s): 1937-1943
Several models of feed-forward complex-valued neural networks have been proposed, and those with split and polar-represented activation functions have been mainly studied. Neural networks with split activation functions are relatively easy to analyze, whereas complex-valued neural networks with polar-represented activation functions have many applications but are difficult to analyze. In previous research, Nitta proved the uniqueness theorem of complex-valued neural networks with split activation functions. Subsequently, he studied their critical points, which caused plateaus and local minima in their learning processes. Thus, the uniqueness theorem is closely related to the learning process. In the present work, we first define three types of reducibility for feed-forward complex-valued neural networks with polar-represented activation functions and prove that we can easily transform reducible complex-valued neural networks into irreducible ones. We then prove the uniqueness theorem of complex-valued neural networks with polar-represented activation functions.
A Cascade System of Dynamic Binary Neural Networks and Learning of Periodic Orbit
Jungo MORIYASU Toshimichi SAITO

PAPER

Publicized: 2015/06/22
Vol: E98-D No:9
Page(s): 1622-1629
This paper studies a cascade system of dynamic binary neural networks. The system is characterized by a signum activation function, ternary connection parameters, and integer threshold parameters. As a fundamental learning problem, we consider the storage and stabilization of one desired binary periodic orbit that corresponds to control signals of switching circuits. For the storage, we present a simple method based on correlation learning. For the stabilization, we present a sparsification method based on the mutation operation in the genetic algorithm. The storage and stability can be investigated using the Gray-code-based return map. Numerical experiments confirm the effectiveness of the learning method.
A Breast Cancer Classifier Using a Neuron Model with Dendritic Nonlinearity
Zijun SHA Lin HU Yuki TODO Junkai JI Shangce GAO Zheng TANG

PAPER-Biocybernetics, Neurocomputing

Publicized: 2015/04/16
Vol: E98-D No:7
Page(s): 1365-1376
Breast cancer is a serious disease across the world, and it is one of the leading causes of cancer death for women. The traditional diagnosis is not only time-consuming but also easily affected. Hence, artificial intelligence (AI), especially neural networks, has been widely used to assist in detecting cancer. In recent years, the computational ability of a single neuron has attracted increasing attention. The main computational capacity of a neuron is located in the dendrites. In this paper, a novel neuron model with dendritic nonlinearity (NMDN) is proposed to classify breast cancer in the Wisconsin Breast Cancer Database (WBCD). In NMDN, the dendrites possess nonlinearity when realizing the excitatory synapses, inhibitory synapses, constant-1 synapses, and constant-0 synapses instead of being simply weighted. Furthermore, the nonlinear interaction among the synapses on a dendrite is defined as a product of the synaptic inputs. The soma adds all of the products of the branches to produce an output. A back-propagation-based learning algorithm is introduced to train the NMDN. The performance of the NMDN is compared with that of classic back propagation neural networks (BPNNs). Simulation results indicate that NMDN possesses superior capability in terms of accuracy, convergence rate, stability, and area under the ROC curve (AUC). Moreover, regarding ROC, for continuum values, the 0-connection branches that remain after evolving can be eliminated from the dendrite morphology to reduce the computational load without affecting classification performance. The results disclose that the computational ability of the neuron has been undervalued, and the proposed NMDN can be an interesting choice for medical researchers in further research.
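The forward computation outlined in this abstract (nonlinear synapses, a product on each dendritic branch, and a sum at the soma) can be sketched as below. The sigmoid synapse function with parameters `w`, `theta`, and steepness `k` is an assumption chosen for illustration; the abstract does not give the exact synaptic function, and the back-propagation-based training is omitted.

```python
import math


def synapse(x, w, theta, k=5.0):
    """Hypothetical sigmoid synapse: output in (0, 1) from input x, weight w, threshold theta."""
    return 1.0 / (1.0 + math.exp(-k * (w * x - theta)))


def neuron_output(inputs, weights, thresholds, k=5.0):
    """Sum over dendritic branches of the product of that branch's synaptic outputs.

    weights[m][i] and thresholds[m][i] belong to the synapse connecting
    input i to branch m.
    """
    total = 0.0
    for w_row, t_row in zip(weights, thresholds):
        product = 1.0
        for x, w, t in zip(inputs, w_row, t_row):
            product *= synapse(x, w, t, k)  # multiplicative interaction on the branch
        total += product                    # the soma adds the branch products
    return total
```

Under this sketch, a branch holding a synapse whose output stays near zero contributes almost nothing to the sum, which is consistent with the abstract's remark that such branches can be removed.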
